Smart City Analytics: Ensemble-Learned Prediction of Citizen Home Care
We present an ensemble learning method that predicts large increases in the
hours of home care received by citizens. The method is supervised, and uses
different ensembles of either linear (logistic regression) or non-linear
(random forests) classifiers. Experiments with data available from 2013 to 2017
for every citizen in Copenhagen receiving home care (27,775 citizens) show that
prediction can achieve state-of-the-art performance as reported in similar
health-related domains (AUC=0.715). We further find that competitive results
can be obtained by using limited information for training, which is very useful
when full records are not accessible or available. Smart city analytics does
not necessarily require full city records.
To our knowledge, this preliminary study is the first to predict large
increases in home care for smart city analytics.
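The abstract above reports its result as an AUC score. As a minimal sketch of what that metric measures, the function below computes AUC as the fraction of (positive, negative) pairs that a classifier ranks correctly, with ties counted as half. The function name and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5).

    labels: iterable of 0/1 ground-truth values (1 = large increase in care).
    scores: iterable of predicted scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three of the four positive/negative pairs are ordered correctly.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.3, 0.5, 0.1])
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based computation would be the usual choice at the scale of a full city dataset.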
The optimal legal retirement age in an OLG model with endogenous labour supply
The long-run welfare implications of the legal retirement age are studied in a perfect foresight overlapping-generations model where agents live for two periods. Agents’ lifetime is divided between working life and retirement by a legal retirement age controlled by the government whereas agents, besides savings, control the intensive margin or "yearly" labour supply. The legal retirement age is utilized to dampen distortionary effects of payroll taxes and public pension annuities and promote capital accumulation. We show that a socially optimal legal retirement age exists and how it depends on whether payroll taxes or benefit annuities ensure budget balance of the PAYG pension system.
Keywords: Optimal legal retirement age; pay-as-you-go pension systems; overlapping-generations model
Sequence Modelling For Analysing Student Interaction with Educational Systems
The analysis of log data generated by online educational systems is an
important task for improving the systems, and furthering our knowledge of how
students learn. This paper uses previously unseen log data from Edulab, the
largest provider of digital learning for mathematics in Denmark, to analyse the
sessions of its users, where 1.08 million student sessions are extracted from a
subset of their data. We propose to model students as a distribution of
different underlying student behaviours, where the sequence of actions from
each session belongs to an underlying student behaviour. We model student
behaviour as Markov chains, such that a student is modelled as a distribution
of Markov chains, which are estimated using a modified k-means clustering
algorithm. The resulting Markov chains are readily interpretable, and in a
qualitative analysis around 125,000 student sessions are identified as
exhibiting unproductive student behaviour. Based on our results this student
representation is promising, especially for educational systems offering many
different learning usages, and offers an alternative to common approaches like
modelling student behaviour as a single Markov chain, as is often done in the
literature.
Comment: The 10th International Conference on Educational Data Mining 201
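To make the idea of modelling sessions as Markov chains concrete, the sketch below estimates a first-order transition matrix from a set of action sequences and scores a session under that chain. It covers only the single-chain building block; the modified k-means clustering described in the abstract is not reproduced. The action names, the additive smoothing, and both function names are assumptions for illustration.

```python
import math


def estimate_chain(sessions, smoothing=1.0):
    """Estimate first-order transition probabilities from action sequences.

    sessions: list of lists of action names.
    smoothing: additive pseudo-count (an assumption here) that keeps every
               transition probability strictly positive.
    """
    actions = sorted({a for s in sessions for a in s})
    counts = {a: {b: smoothing for b in actions} for a in actions}
    for s in sessions:
        for prev, nxt in zip(s, s[1:]):
            counts[prev][nxt] += 1.0
    chain = {}
    for a in actions:
        total = sum(counts[a].values())
        chain[a] = {b: counts[a][b] / total for b in actions}
    return chain


def log_likelihood(session, chain):
    """Log-probability of the transitions in a session under a chain.

    Raises KeyError if the session contains an action the chain has not seen.
    """
    return sum(math.log(chain[p][n]) for p, n in zip(session, session[1:]))


# Hypothetical sessions; the real log data is not public in this listing.
sessions = [
    ["login", "exercise", "exercise", "logout"],
    ["login", "video", "logout"],
]
chain = estimate_chain(sessions)
score = log_likelihood(["login", "exercise", "logout"], chain)
```

In a clustering setting, a score like `log_likelihood` could be used to assign each session to the chain that explains it best, after which each chain is re-estimated from its assigned sessions.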
Neural Speed Reading with Structural-Jump-LSTM
Recurrent neural networks (RNNs) can model natural language by sequentially
'reading' input tokens and outputting a distributed representation of each
token. Due to the sequential nature of RNNs, inference time is linearly
dependent on the input length, and all inputs are read regardless of their
importance. Efforts to speed up this inference, known as 'neural speed
reading', either ignore or skim over part of the input. We present
Structural-Jump-LSTM: the first neural speed reading model to both skip and
jump text during inference. The model consists of a standard LSTM and two
agents: one capable of skipping single words when reading, and one capable of
exploiting punctuation structure (sub-sentence separators (,:), sentence end
symbols (.!?), or end of text markers) to jump ahead after reading a word. A
comprehensive experimental evaluation of our model against all five
state-of-the-art neural reading models shows that Structural-Jump-LSTM achieves
the best overall floating point operations (FLOP) reduction (hence is faster),
while keeping the same accuracy or even improving it compared to a vanilla LSTM
that reads the whole text.
Comment: 10 page
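The jump mechanism described above moves the reader ahead to a punctuation boundary. The sketch below shows only that index arithmetic: given a token list, the current position, and a set of stop symbols, it returns the index of the next token to read. In the paper the decision to jump is made by a learned agent; here the choice of stop symbols is simply passed in, and the function name and example sentence are assumptions.

```python
SUB_SENTENCE = {",", ":"}          # sub-sentence separators named in the abstract
SENTENCE_END = {".", "!", "?"}     # sentence end symbols named in the abstract


def jump_target(tokens, position, stop_symbols):
    """Index of the next token to read after jumping from `position`.

    Scans forward for the first token in `stop_symbols` and resumes reading
    right after it. If none is found, returns len(tokens), i.e. end of text.
    """
    for i in range(position + 1, len(tokens)):
        if tokens[i] in stop_symbols:
            return i + 1
    return len(tokens)


tokens = ["The", "model", ",", "which", "is", "fast", ",", "wins", "."]
after_comma = jump_target(tokens, 0, SUB_SENTENCE)      # resumes at "which"
after_sentence = jump_target(tokens, 3, SENTENCE_END)   # reaches end of text
```

Every token between `position` and the returned index is never fed to the LSTM, which is where the FLOP reduction would come from.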
Modelling Sequential Music Track Skips using a Multi-RNN Approach
Modelling sequential music skips provides streaming companies the ability to
better understand the needs of the user base, resulting in a better user
experience by reducing the need to manually skip certain music tracks. This
paper describes the solution of the University of Copenhagen DIKU-IR team in
the 'Spotify Sequential Skip Prediction Challenge', where the task was to
predict the skip behaviour of the second half in a music listening session
conditioned on the first half. We model this task using a Multi-RNN approach
consisting of two distinct stacked recurrent neural networks, where one network
focuses on encoding the first half of the session and the other network focuses
on utilizing the encoding to make sequential skip predictions. The encoder
network is initialized by a learned session-wide music encoding, and both of
them utilize a learned track embedding. Our final model consists of a majority
voted ensemble of individually trained models, and ranked 2nd out of 45
participating teams in the competition with a mean average accuracy of 0.641
and an accuracy on the first skip prediction of 0.807. Our code is released at
https://github.com/Varyn/WSDM-challenge-2019-spotify.
Comment: 4 page
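The final model above is a majority-voted ensemble. A minimal sketch of per-position majority voting over binary skip predictions could look as follows. This is not the released code; the function name, the tie-breaking rule (ties resolve to "no skip"), and the toy predictions are assumptions.

```python
def majority_vote(predictions):
    """Combine binary skip predictions from several models by majority vote.

    predictions: list with one entry per model, each a list of 0/1 values
                 (1 = skip) for the tracks in the second half of a session.
    Ties, which can only occur with an even number of models, resolve to 0
    (an assumption made for this sketch).
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must predict the same number of tracks")
    n_models = len(predictions)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


# Hypothetical outputs of three individually trained models for three tracks.
combined = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```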
The diffusion of health technologies: Cultural and biological divergence
This paper proposes the hypothesis that genetic distance to the health frontier influences population health outcomes. Evidence from a world sample suggests that genetic distance - interpreted as long-term cultural and biological divergence - is an important factor in understanding health inequalities across countries. In particular, the paper documents a remarkably robust link between genetic distance and health as measured by life expectancy at birth and the adult survival rate. Also, the evidence reveals that the link has strengthened considerably over the 20th century, which highlights the increasing effects of globalization on health conditions across countries through the transmission of health technologies.
Keywords: Population health; international diffusion of health technologies; globalization; cultural and biological divergence
Unsupervised Semantic Hashing with Pairwise Reconstruction
Semantic Hashing is a popular family of methods for efficient similarity
search in large-scale datasets. In Semantic Hashing, documents are encoded as
short binary vectors (i.e., hash codes), such that semantic similarity can be
efficiently computed using the Hamming distance. Recent state-of-the-art
approaches have utilized weak supervision to train better performing hashing
models. Inspired by this, we present Semantic Hashing with Pairwise
Reconstruction (PairRec), which is a discrete variational autoencoder based
hashing model. PairRec first encodes weakly supervised training pairs (a query
document and a semantically similar document) into two hash codes, and then
learns to reconstruct the same query document from both of these hash codes
(i.e., pairwise reconstruction). This pairwise reconstruction enables our model
to encode local neighbourhood structures within the hash code directly through
the decoder. We experimentally compare PairRec to traditional and
state-of-the-art approaches, and obtain significant performance improvements in
the task of document similarity search.
Comment: Accepted at SIGIR'2
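The efficiency claim above rests on comparing hash codes with the Hamming distance. As a minimal sketch, the code below stores each hash code as an integer, computes the distance with XOR plus a bit count, and ranks documents by distance to a query. It does not reproduce PairRec itself; the function names, the 4-bit codes, and the document ids are assumptions.

```python
def hamming(a, b):
    """Number of bit positions in which two integer hash codes differ."""
    return bin(a ^ b).count("1")


def nearest(query_code, doc_codes, k=2):
    """Ids of the k documents whose codes are closest to the query code.

    doc_codes: dict mapping document id to integer hash code.
    Ties in distance are broken by document id so the result is deterministic.
    """
    ranked = sorted(
        doc_codes.items(),
        key=lambda item: (hamming(query_code, item[1]), item[0]),
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical 4-bit codes; real semantic hashing codes are typically longer.
doc_codes = {"a": 0b1010, "b": 0b0100, "c": 0b1011}
top = nearest(0b1011, doc_codes, k=2)
```

XOR followed by a population count maps to a handful of machine instructions per comparison, which is why binary codes suit large-scale similarity search.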